Heterogeneity Considered Harmful to Algorithm Designers
Authors
Abstract
In this paper, we deal with algorithmic issues on heterogeneous platforms. We show that static scheduling and load-balancing strategies are absolutely needed to achieve good performance, in contrast to the situation for homogeneous parallel machines, where dynamic schemes often turn out to be very satisfactory. However, we also show that static strategies targeted at heterogeneous platforms are difficult to design and implement: intuitively, data distribution must obey a much more refined model than standard block-cyclic distributions in order to balance the load equally between processors of different speeds. Technically, we state several NP-completeness results that demonstrate the intrinsic difficulty of static load balancing on heterogeneous platforms.

1 Matrix Product on 2D Homogeneous Grids
We briefly describe the parallel matrix multiplication algorithm on a 2D homogeneous grid that is used in ScaLAPACK [4]. The A, B and C matrices are identically partitioned into rectangles. There is a one-to-one mapping between these rectangles and the processors. Each processor is responsible for updating its rectangle: more precisely, at step k, the k-th column of A is broadcast horizontally, the k-th row of B is broadcast vertically, and each processor updates its rectangle with the product of (fragments of) this column and this row as soon as they are received. As depicted in Figure 1, the total volume of data exchanged is proportional to the sum of the half-perimeters of the rectangles if the underlying communication network does not allow communications to be performed in parallel. The work is perfectly balanced: at each step, each processor processes the same amount of data at the same speed as the others.

LIP, UMR CNRS–ENS Lyon–INRIA 5668, Ecole Normale Supérieure de Lyon, 46, Allée d’Italie, 69364 Lyon Cedex 07, France. E-mail: [email protected]

Figure 1. Homogeneous grid (matrices A and B).

2 Matrix Product on Heterogeneous Platforms
How can the previous algorithm be modified for a heterogeneous platform?
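As a reference point for this question, the homogeneous algorithm of Section 1 can be sketched as a serial simulation. This is only an illustration under stated assumptions, not the ScaLAPACK implementation: the function names are hypothetical, the matrices are square of order n, the p x q grid uses a plain block distribution, and n is assumed to be divisible by both p and q.

```python
def outer_product_matmul(A, B, p, q):
    """Simulate the outer-product matrix product C = A * B on a p x q grid.

    Assumptions (illustrative only): A and B are n x n lists of lists,
    and n is divisible by p and q so that all rectangles have equal size.
    """
    n = len(A)
    if n % p or n % q:
        raise ValueError("n must be divisible by p and q in this sketch")
    rows, cols = n // p, n // q          # size of each processor's rectangle
    C = [[0.0] * n for _ in range(n)]
    for k in range(n):
        col_k = [A[i][k] for i in range(n)]  # pivot column of A, broadcast horizontally
        row_k = B[k]                         # pivot row of B, broadcast vertically
        for pi in range(p):                  # each (pi, pj) stands for one processor
            for pj in range(q):
                # The processor only needs the fragments of col_k and row_k
                # that cover its own rectangle of C.
                for i in range(pi * rows, (pi + 1) * rows):
                    for j in range(pj * cols, (pj + 1) * cols):
                        C[i][j] += col_k[i] * row_k[j]
    return C


def data_received_per_step(n, p, q):
    """Sum over all processors of the half-perimeter of their rectangle."""
    return p * q * (n // p + n // q)
```

For instance, `outer_product_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2, 2)` gives the usual product, and `data_received_per_step(4, 2, 2)` counts 16 matrix entries received per step, that is, the sum of the half-perimeters of the four 2 x 2 rectangles.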
The idea is to keep the same framework: at each step, one pivot column and one pivot row are communicated to all processors, and independent updates take place. However, with processors of different speeds, we cannot distribute rectangles of the same size to the processors. Intuitively, we want to balance the computing load so that each processor receives an amount of work in accordance with its computing power. Because all blocks require the same number of arithmetic operations, each processor executes an amount of work that is proportional to the number of blocks allocated to it, hence proportional to the area of its rectangle. Thus, parallelizing the matrix-matrix product on a heterogeneous platform turns out to be equivalent to partitioning the unit square into rectangles of prescribed areas while minimizing a cost function based on the half-perimeters of these rectangles.

3 Rectangle Partitioning
Different partition types are summarized in Figure 2. We present complexity results on problems equivalent to matrix distribution on heterogeneous platforms with networks such as Ethernet ( ) or Myrinet ( ) (see Table 1).

Figure 2. Partitioning the unit square: a taxonomy (panels: 1D; 2D (Grid); Column-based; Recursive; Without constraint).

Table 1. Various complexity results for rectangle partitioning. Column headings: 1D; 2D; Column-based; Recursive; General. Entries as listed: Polynomial [2]; No known results so far; NP-hard, guaranteed heuristic ( from the optimum) [2]; Polynomial; NP-hard [1]; NP-hard ([1]), guaranteed heuristic ( from the optimum); No known results so far; NP-hard, guaranteed heuristic ( from the optimum) [3].

Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER’00) 0-7695-0896-0/00 $ 17.00 © 2000 IEEE

It is well known that 1D partitions are not scalable, since the volume of communication grows proportionally to the number of machines. Adapting this kind of distribution to heterogeneous platforms is fairly easy but still leads to non-scalable distributions. Nevertheless, designing heterogeneous 2D distributions turns out to be more difficult:

Heterogeneous 2D Grid: Given real positive numbers s.t. , is there a heterogeneous grid partition of the unit square into rectangles of size s.t. , and a one-to-one mapping from to minimizing . This optimisation problem is NP-hard (see [1] for more details).

The communication pattern is simpler for 2D grids than for more general processor arrangements, but 2D grids do not, in general, allow a perfectly balanced workload. Column-based partitions (with a different number of processors in each column) are always perfectly balanced, are often close to the optimal solution, and generally lead to a rather simple communication pattern. Distributing a matrix columnwise on a heterogeneous platform made of different processors linked by a homogeneous network (for example, Myrinet) turns out to be equivalent to solving the following problem, where the s_i are the processor speeds:

Col-Peri-Max: Given p real positive numbers s_1, ..., s_p s.t. s_1 + ... + s_p = 1, find a column-based partition of the unit square into p rectangles of area s_i and of size h_i × v_i, so that max_i (h_i + v_i) is minimized. This optimisation problem is NP-hard, but we have provided a guaranteed heuristic (see [1] for more details).

4 Conclusion
We have stated several complexity results on the static load-balancing problem on heterogeneous platforms. The practical applicability of our theoretical results has been demonstrated through a series of experiments on one HNOW (heterogeneous network of workstations) with Fast Ethernet and on another one with Myrinet/BIP as the communication network. Our current work is focused on the design and implementation of efficient on-the-fly redistribution to cope with potential variations of the machine loads.
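To make the column-based partitions of Section 3 concrete, the sketch below builds the rectangles for a given assignment of processors to columns and evaluates the two half-perimeter costs (sum and maximum). It is a minimal illustration and not the guaranteed heuristic of [1]: the function names are hypothetical, the grouping of processors into columns is supplied by the caller rather than optimized, and speeds are normalized so that the prescribed areas sum to 1.

```python
def areas_from_speeds(speeds):
    """Normalize processor speeds so that the prescribed areas sum to 1."""
    total = sum(speeds)
    return [s / total for s in speeds]


def column_based_partition(columns):
    """Build a column-based partition of the unit square.

    `columns` is a list of columns, each a list of prescribed areas.
    A column is as wide as the sum of its areas, and each rectangle in it
    has height area / width, so every rectangle has exactly its prescribed
    area. Returns a list of (width, height) pairs.
    """
    rectangles = []
    for column in columns:
        width = sum(column)
        for area in column:
            rectangles.append((width, area / width))
    return rectangles


def peri_sum(rectangles):
    """Sum of half-perimeters (sequential communications)."""
    return sum(w + h for w, h in rectangles)


def peri_max(rectangles):
    """Largest half-perimeter (parallel communications)."""
    return max(w + h for w, h in rectangles)


# Three processors with relative speeds 2, 1, 1.
areas = areas_from_speeds([2, 1, 1])                  # [0.5, 0.25, 0.25]
two_columns = column_based_partition([[areas[0]], areas[1:]])
one_d = column_based_partition([[a] for a in areas])  # 1D: one processor per column
print(peri_sum(two_columns), peri_max(two_columns))   # 3.5 1.5
print(peri_sum(one_d), peri_max(one_d))               # 4.0 1.5
```

In this small example both layouts are perfectly balanced, since each rectangle has an area proportional to its processor's speed, but grouping the two slower processors in one column lowers the sum of half-perimeters from 4.0 to 3.5. Choosing the grouping that minimizes the cost is the hard part that the complexity results address.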
Publication date: 2000